An Open Source Tool for Partial Parsing and Morphosyntactic Disambiguation

Authors

  • Adam Przepiórkowski
  • Aleksander Buczyński
Abstract

This article presents a formalism and an open source implementation of a new tool for simultaneous partial parsing and morphosyntactic disambiguation and correction. We argue that, contrary to the common pipeline approach, where morphosyntactic tagging is fully accomplished before shallow or partial parsing, both tasks are best approached in parallel. This has been suggested before, and formalisms which allow for the interweaving of partial parsing and morphosyntactic disambiguation have been proposed. Our approach is novel in that a fully uniform formalism is presented, and a single grammar rule may contain structure-building operations, as well as morphosyntactic correction and disambiguation operations. The formalism has been implemented in Java and is now available under the GNU General Public License.
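The idea of a single rule carrying both kinds of operation can be illustrated with a minimal Java sketch. It does not reproduce the tool's actual formalism or API: every name here (`RuleSketch`, `Reading`, `Token`, `Group`, `applyAdjNounRule`), the tag strings, and the case-agreement condition are hypothetical and chosen only for illustration. The sketch assumes a rule that matches an adjective followed by a noun agreeing in case, discards the readings that do not fit the match, and wraps the two tokens in a group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names come from the actual tool.
public class RuleSketch {

    /** One morphosyntactic reading of a token: part of speech plus case. */
    static final class Reading {
        final String pos;
        final String gramCase;
        Reading(String pos, String gramCase) { this.pos = pos; this.gramCase = gramCase; }
        @Override public String toString() { return pos + ":" + gramCase; }
    }

    /** A token with its (possibly ambiguous) list of readings. */
    static final class Token {
        final String form;
        final List<Reading> readings;
        Token(String form, Reading... readings) {
            this.form = form;
            this.readings = new ArrayList<>(Arrays.asList(readings));
        }
    }

    /** A syntactic group built over a span of tokens. */
    static final class Group {
        final String label;
        final List<Token> tokens;
        Group(String label, List<Token> tokens) { this.label = label; this.tokens = tokens; }
    }

    /** Cases found among the readings of a token that have the given part of speech. */
    static Set<String> cases(Token t, String pos) {
        Set<String> result = new HashSet<>();
        for (Reading r : t.readings) {
            if (r.pos.equals(pos)) result.add(r.gramCase);
        }
        return result;
    }

    /**
     * One rule, two kinds of operation. If an adjective reading of the first token
     * and a noun reading of the second agree in case, the rule (1) discards the
     * readings that do not fit and (2) builds a group. Returns null on no match.
     */
    static Group applyAdjNounRule(Token first, Token second) {
        Set<String> shared = cases(first, "adj");
        shared.retainAll(cases(second, "subst"));
        if (shared.isEmpty()) return null; // no agreeing readings: the rule does not fire

        // Disambiguation operation: keep only readings consistent with the match.
        first.readings.removeIf(r -> !(r.pos.equals("adj") && shared.contains(r.gramCase)));
        second.readings.removeIf(r -> !(r.pos.equals("subst") && shared.contains(r.gramCase)));

        // Structure-building operation: wrap both tokens in a nominal group.
        return new Group("NG", Arrays.asList(first, second));
    }

    public static void main(String[] args) {
        Token a = new Token("tokenA", new Reading("adj", "nom"), new Reading("adj", "gen"));
        Token b = new Token("tokenB", new Reading("subst", "gen"), new Reading("verb", "none"));
        Group g = applyAdjNounRule(a, b);
        System.out.println(g.label + " " + a.readings + " " + b.readings);
    }
}
```

In a pipeline design the pruning step would have to finish before any grouping rule runs. Here the syntactic match itself supplies the evidence for pruning, which is the motivation the abstract gives for handling both tasks in parallel.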

Similar resources

spade Demo: An Open Source Tool for Partial Parsing and Morphosyntactic Disambiguation

The paper presents Spejd, an Open Source Shallow Parsing and Disambiguation Engine. Spejd (abbreviated to ♠) is based on a fully uniform formalism both for constituency partial parsing and for morphosyntactic disambiguation — the same grammar rule may contain structure-building operations, as well as morphosyntactic correction and disambiguation operations. The formalism and the engine are more...

Spejd: A Shallow Processing and Morphological Disambiguation Tool

This article presents a formalism and a beta version of a new tool for simultaneous morphosyntactic disambiguation and shallow parsing. Unlike in the case of other shallow parsing formalisms, the rules of the grammar allow for explicit morphosyntactic disambiguation statements, independently of structure-building statements, which facilitates the task of the shallow parsing of morphosyntactical...

An Implementation of Combined Partial Parser and Morphosyntactic Disambiguator

The aim of this paper is to present a simple yet efficient implementation of a tool for simultaneous rule-based morphosyntactic tagging and partial parsing formalism. The parser is currently used for creating a treebank of partial parses in a valency acquisition project over the IPI PAN Corpus of Polish.

Verbal Morphosyntactic Disambiguation through Topological Field Recognition in German-Language Law Texts

The morphosyntactic disambiguation of verbs is a crucial pre-processing step for the syntactic analysis of morphologically rich languages like German and domains with complex clause structures like law texts. This paper explores how much linguistically motivated rules can contribute to the task. It introduces an incremental system of verbal morphosyntactic disambiguation that exploits the conce...

JoBimText Visualizer: A Graph-based Approach to Contextualizing Distributional Similarity

We introduce an interactive visualization component for the JoBimText project. JoBimText is an open source platform for large-scale distributional semantics based on graph representations. First we describe the underlying technology for computing a distributional thesaurus on words using bipartite graphs of words and context features, and contextualizing the list of semantically similar words t...

Publication date: 2007